Testing Dimension Reduction Methods for Text Retrieval
Author
Abstract
In this paper, we compare the performance of several dimension reduction techniques, namely LSI, random projections, and FastMap. The qualitative comparison is based on rank lists and is evaluated on a subset of the TREC 5 collection and the corresponding TREC 8 ad-hoc queries. In addition, projection times and intrinsic dimensionality were measured to provide a common baseline for the methods' usability.
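As a rough illustration of what a rank-list comparison between two of the named techniques can look like, the sketch below projects a toy term-document matrix with LSI (truncated SVD) and with a Gaussian random projection, then ranks documents by cosine similarity to a query. The matrix, the query, the helper names (`lsi_basis`, `random_projection_matrix`, `rank_list`, `top_k_overlap`), and the top-k overlap measure are all assumptions made for illustration. They are not the paper's data, code, or evaluation measure, and FastMap is not covered.

```python
import numpy as np

def lsi_basis(A, k):
    """Return the k leading left singular vectors of A (terms x k)."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]

def random_projection_matrix(n_terms, k, seed=0):
    """Return a Gaussian random projection matrix (terms x k)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_terms, k)) / np.sqrt(k)

def rank_list(P, A, q):
    """Rank documents (columns of A) by cosine similarity to q after projecting with P."""
    docs = P.T @ A            # k x n_docs
    query = P.T @ q           # k
    norms = np.linalg.norm(docs, axis=0) * np.linalg.norm(query)
    sims = (docs.T @ query) / (norms + 1e-12)
    return [int(i) for i in np.argsort(-sims, kind="stable")]

def top_k_overlap(a, b, k):
    """Fraction of documents shared by the top-k entries of two rank lists."""
    return len(set(a[:k]) & set(b[:k])) / k

# Toy term-document matrix (5 terms x 4 documents) and query: hypothetical data.
A = np.array([[2, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 3, 0, 1],
              [0, 1, 0, 2],
              [0, 0, 2, 1]], dtype=float)
q = np.array([1, 1, 0, 1, 0], dtype=float)

baseline = rank_list(np.eye(A.shape[0]), A, q)   # ranking in the original term space
lsi_rank = rank_list(lsi_basis(A, 2), A, q)      # ranking after LSI to 2 dimensions
rp_rank = rank_list(random_projection_matrix(A.shape[0], 2), A, q)

print("baseline:", baseline)
print("LSI     :", lsi_rank, "overlap@2 =", top_k_overlap(baseline, lsi_rank, 2))
print("RP      :", rp_rank, "overlap@2 =", top_k_overlap(baseline, rp_rank, 2))
```

When `k` equals the rank of the matrix, the LSI ranking coincides with the baseline ranking, since the projection then preserves every document-query inner product and all document norms. Smaller values of `k` trade ranking fidelity for a lower dimension.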
Similar resources
Dimension Reduction Methods of Text Documents by Neural Networks
The paper introduces different dimension reduction methods in the area of text document retrieval. First, the most commonly used text document retrieval models are described; the second part then describes the analytical approach and the neural network approaches to dimension reduction of the keyword space. Dimension reduction methods reduce the keyword space to a much smaller size together with...
Analysis of unsupervised dimensionality reduction techniques
Domains such as text and images contain large amounts of redundancy and ambiguity among their attributes, which result in considerable noise effects (i.e., the data is high-dimensional). Retrieving data from high-dimensional datasets is a major challenge. Dimensionality reduction techniques have been a successful avenue for automatically extracting the latent concepts by removing the noise and...
Dimension reduction based on centroids and least squares for efficient processing of text data
Dimension reduction in today's vector space based information retrieval systems is essential for improving computational efficiency in handling massive data. In our previous work we proposed a mathematical framework for lower dimensional representations of text data in vector space based information retrieval, and a couple of dimension reduction methods using minimization and matrix rank reductio...
Lower Dimensional Representation of Text Data in Vector
Dimension reduction in today's vector space based information retrieval system is essential for improving computational efficiency in handling massive data. In this paper, we propose a mathematical framework for lower dimensional representation of text data in vector space based information retrieval using minimization and matrix rank reduction formula. We illustrate how the commonly used Latent ...
Text Document Retrieval by Document Space Dimension Reduction with Feed-Forward Neural Networks
The paper deals with text document retrieval from a given document collection using neural networks, namely the cascade neural network model, linear and nonlinear Hebbian neural networks, and the linear autoassociative neural network. Using neural networks, it is possible to reduce the dimension of the search space while preserving the highest retrieval accuracy.
Publication date: 2005